Ownership interlock for cache data units.

Ensures data integrity in a data processing system by providing an
ownership interlock on the data units in a pipeline to a store-in type
of cache. An ownership interlock prevents any processor ownership
change (i.e. exclusive or readonly ownership) from occurring for a
cache data unit until all outstanding stores have been made in the
cache data unit, after which the ownership may be changed. An
ownership change may be signalled by a cross-invalidate (XI) signal to
a processor. Outstanding stores are received by the pipeline after the
stores are completed by a processor, and the outstanding stores are
output from the pipeline into a store-in cache. A continuous flow of
stores is enabled into and out of the pipeline to expedite a change of
ownership requested of a data unit in the cache. The continuous flow
avoids having to stop a processor from putting stores into the
pipeline and avoids forcing all outstanding stores out of the pipeline
into the cache before indicating a change of processor ownership.

The invention relates to an ownership change control, particularly
according to the preamble of claims 1 and 12.

Any processor in a data processing system can be an exclusive owner of
a data unit in the system storage hierarchy. Exclusive ownership of a
data unit restricts to one of plural processors in the system the
ability to write in the data unit, and only one processor at a time
can have exclusive ownership. The exclusive ownership of a data unit
can be changed from one processor to another processor at the request
of a processor, and the ownership can be changed from exclusive to
public ownership, and vice versa. Public ownership allows all
processors to read, but not to write in, the data unit. The invention
ensures data integrity in a data processing system by providing an
ownership interlock on the data units in a store-in type of cache.
The ownership interlock prevents any change from occurring in the
exclusive ownership of a cache data unit until all stores have been
made in the cache data unit, and thereafter ownership may be changed.

US patent application serial no. 679,900 (PO 
990 033), filed on 3 April 1991 and owned by the 
same assignee, has all of its content fully incor- 
porated herein by reference and is considered part 
of this specification. 

The store-in type of cache has been used in 
computer systems because it requires less band- 
width for its memory bus (between the memory 
and the cache) than is required by a store-through 
type of cache for the same frequency of processor 
accesses. Each cache location may be assigned to 
a processor request and receive a copy of a data 
unit fetched from system main memory or another 
cache in the system. With a store-in cache, a 
processor stores into a data unit in a cache location
without storing into the correspondingly addressed 
data unit in main memory, so that the cache loca- 
tion may become the only location in the system 
containing the latest version of that data unit. The 
processor may make as many stores (changes) in 
the data unit as its executing program requires. 
The integrity of data in the system requires that the 
latest version of the data unit be used for any 
subsequent processing of the data unit. Exclusive 
ownership (authority) of a data unit has been re- 
quired in prior store-in caches before allowing writ- 
ing in the data unit. 

A store-through type of cache is used only for 
fetching and all store accesses pass through it to 
the next level (another cache or main storage) in 
the system storage hierarchy. However, a store- 
through cache usually has stores performed in it as 
they pass through it, in order to maintain the latest 
version of data for obtaining the fastest fetching by 
its processor. 



Exclusive ownership (authority) to change a cache data unit is
assigned to a processor before it is allowed to perform its first
store operation in the data unit. The assignment of processor
ownership has been controlled by setting an exclusive flag bit in a
cache directory (sometimes called a tag directory) associated with the
respective data unit in the cache. The flag bit can be set to indicate
either exclusive ownership or public ownership (sometimes called
"read-only authority"). Exclusive ownership by a processor allows only
it to write into the data unit. The public (read-only) ownership of a
data unit does not allow any processor to store into that data unit,
but allows each processor in the system to read that data unit, which
is then sharable by all processors.

USA patent 4,394,731 (PO9-80-016) to Flusche et al teaches the use of
exclusive/readonly flags in private processor directories used with
private store-in caches and teaches the use of copy directories for
processor identification. Patent 4,394,731 used copies of all
processor private L1 directories for identifying processor ownership
and for controlling changes in the ownership of a data unit.
Cross-interrogation was used among the copy directories to identify
which processor had exclusive ownership of a data unit, and
cross-invalidation was used from any identified processor's copy
directory to its L1 cache to invalidate its conflicting address to
assure exclusivity to a requesting processor, when changing the
ownership from exclusive to public readonly ownership, or vice versa.

A store-in cache updates (writes in) a cache data unit which has its
old version located at an associated address in main memory. When the
updated data unit is no longer needed in the cache, it is cast out of
the cache by writing the updated cache version over the old version of
the data unit at the associated address in main memory. The castout
operation is done when an updated data unit is in a cache location
which is to be reallocated to another data unit (e.g. fetched from
another main memory address). For example, a processor may request to
store into a data unit not currently in the cache. Then the requested
data unit must be fetched from main memory (or from another cache)
using the requested address and stored in a newly assigned cache
location. The cache assignment of a location for the new data unit
will be in a cache location not in current use if one can be found.
However, only a limited number of cache locations exist, and all may
currently contain updated data units. If all the assignable cache
locations are currently occupied with changed data units, then one of
them must be reassigned for the new request for a data unit not
currently in the cache. Then a castout to main memory is required of
the updated cache data unit before the reassigned cache location can
be made available for use by the new request. The castout process is
an example of a change of ownership in a data unit, because the
castout data unit has its ownership changed from an exclusive
processor ownership to a main memory ownership.

This problem is not generally applicable to a 
store-through type of cache, since any stores made 
in it will also have been made in its tracking mem- 
ory, which may be another cache (store-in or store- 
through) or may be main memory. 

A change in the ownership of any data unit is controlled by the
processor request process in a system. Only one of the plural
processors in a multiprocessing (MP) system can have exclusive
ownership (write authority) at any one time over any data unit. The
exclusive ownership over any data unit may be changed from one
processor to another when a different processor requests exclusive
ownership. The prior mechanism for indicating exclusive ownership for
a processor was to provide an exclusive (EX) flag bit in each L1
directory entry in a processor's private L1 cache; and the EX bit was
set on to indicate which of the associated data units were "owned" by
that processor. The reset state of the EX flag bit indicated public
ownership, which was called "readonly authority" for the associated
data unit that made it simultaneously available to all processors in
the system. Thus, each valid data unit in any processor's private L1
cache had either exclusive ownership or public ownership.

There are many types of interlock controls in the prior art. One type
of prior interlock control requires a castout for a changed cache data
unit from a store-in cache to main storage to occur before a new data
unit may be represented by the same cache directory entry, which will
be overlaid for the new entry. Whether the data unit is changed has
been indicated by a change flag bit in an accessed cache directory
entry (indicating its associated data unit has been changed).

The invention deals with a high-speed pipelined computer system in
which multiple machine cycles of delay intervene between the time a
store command is generated by a processor and the time its store is
made in a target cache data unit. Such a delayed store command is
called an "outstanding store" or a "pending store" during its flight
time from its generation until it is stored in its targeted data unit
in a store-in cache.

This invention requires that all outstanding changes be made in a data
unit by a processor exclusively owning the data unit in a store-in
cache before the ownership of the data unit can be changed to a
different processor. Outstanding stores are caused by a store command
pipeline provided between a processor and the cache to buffer stores
in a manner that improves the efficiency of processor operation, such
as by freeing the processor to do other processing as soon as it
generates each store command.

The object of the invention is to provide an ownership interlock that
prevents changes in the ownership of a data unit in a store-in cache
until all outstanding stores have been made in the cache data unit.

The solution is described in the characterizing part of claims 1, 12
and 13.

This invention aids system efficiency by permitting a pipelined store
stack to receive store requests from a processor in a continuous
manner. Without this invention, the processor would need to stop
sending store commands to the store stack when the processor receives
an XI signal (for invalidating any XI addressed entry in its L1 cache
directory) until all outstanding store commands then in the stack are
completed in the cache to assure the integrity of data in the system.
Such stoppage of a processor's store operations upon each received XI
signal would reduce the rate at which stores are generated in the
system and the rate at which stores could be received by an L2 cache,
with a resulting significant loss in system efficiency.

Processor ownership over a data unit is considered to change: 1. when
the requested data unit is found in a cache location which needs to be
reassigned and have its ownership changed to the requesting processor
in the cache directory; or 2. when the requested data unit is not
found in the cache and a cache location containing a changed data unit
is reassigned to the requested data unit, so that the changed data
unit must be cast out before the requested data unit is fetched into
the same cache location, thereby changing the ownership of both the
castout data unit and the requested data unit.

The invention may be used with different types of ownership
indications for each data unit in a multiple processor system.
Ownership may be expressed in a number of different ways, such as by
the use of a CPU identifier (CPID) field in each directory entry to
identify which of plural CPUs owns the associated data unit
exclusively or whether the data unit is owned publicly by all CPUs. Or
CPU ownership may be indicated by copies of CPU private L1 directories
which are cross-interrogated by all CPU requests in the system to
determine which CPU exclusively owns the requested data unit (by its
copy directory indicating its exclusive ownership, or indicating the
requested data unit is publicly owned). The CPID ownership-indicating
method centralizes the system coherence control in a single shared
directory, which is not done in the copy directory method.

A cache data unit can have its ownership transferred from a currently
owning processor to a requesting processor when the rules of ownership
change are followed. When CPID is used in a single system directory,
only that CPID field needs to be changed. But when copy directories
are used to indicate ownership, a requested data unit has to be moved
from one CPU's L1 cache, L1 directory and L1 copy directory (where the
data unit is found) to the requesting CPU's L1 cache, L1 directory and
copy directory.

These different data unit ownership methods may be used in a multiple
processor system using only private CPU L1 caches and having a shared
single system directory, or they may be used in a multiple processor
system using private CPU L1 caches and a shared L2 cache having the
shared single system directory. Both of these methods require the use
of a change field in each directory entry of a cache to indicate if
the associated data unit has been changed.

The preferred embodiment uses the CPID ownership-indicating method in
a system using an L2 store-in cache shared by a plurality of CPUs
having private L1 store-through caches. The L2 cache uses hardware in
the storage control element, SCE, to send a specific cross-invalidate
(XI) signal to the current exclusive-owning processor indicated by the
current CPID field in the L2 entry for changing the exclusive
ownership of a data unit. The XI receiving processor must provide an
XI response to determine when all stores must be completed in the
accessed L2 data unit before its CPID can be changed in the L2
directory entry. A store command may be made to any L2 entry currently
indicating exclusive ownership by the CPU, and the store is made
concurrently in both the requested L1 cache and the L2 cache, although
it takes longer to make the store in the L2 cache than the L1 cache
because of a pipelined store stack in the SCE for stacking plural
store commands from each processor. Although the store stack delays
making the stores in L2, it immediately frees up the processor so it
can do another operation.

If the current CPID indicates a public ownership and the new request
also wants public ownership of the same data unit, then no XI
signalling is done and the L2 entry is not modified for the new
request.

But if the current CPID indicates a public ownership, and a new
request for the data unit wants exclusive ownership, then a general XI
signal is sent to all CPUs having the publicly owned unit. No XI
response back to the SCE is provided from the CPU receiving the
general XI signal, and each CPU containing the XI addressed data unit
of any XI signal invalidates it in its L1 cache. Then the L2 directory
entry can have its CPID immediately set to the requesting CPU's
exclusive CPID to change the ownership of its data unit from public to
exclusive. Accordingly, no waiting period is needed for any response
to a general XI signal from any CPU, as is the case with a specific XI
signal.

A specific XI signal to the CPU requires the CPU to give up ownership
of the XI addressed data unit. However, it does not require the CPU to
give up ownership instantly. The CPU can finish up any required
operations to that data unit before giving up ownership and sending an
XI response.

A CPU presumes it has given up ownership of an L2 cache location at
the time it sends an XI response signal. However, one or more of the
CPU's outstanding stores to the XI addressed data unit may not yet
have been made in the L2 cache, because these stores may still be in
the pipeline, in a store queue, or in the stack, which delays the
outstanding stores from being made immediately in the cache.

The outstanding stores in the store stack must be received by the
intended cache data unit before its ownership is allowed to change.
Data integrity in the system would be adversely affected if the
ownership of a data unit were allowed to change before any outstanding
stores in the stack were made in the data unit, because then the data
unit may not have its latest value when it is fetched by a new owner.

Thus, before a reassignment of ownership to a cache data unit can be
allowed, all outstanding stores in the store stack must be completed
to the data unit addressed by the CPU which issued the stores, and
that CPU must remain responsible for all changes it made up to the
time it issued its XI response signal to indicate the precise point in
its program execution where it signalled the termination of its
ability to make further data changes in that data unit.
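The rule above can be pictured with a minimal software sketch. It is
not the hardware of the embodiment; the function name, the dictionary
based directory and cache, and the (unit, offset, value) store format
are all assumptions chosen for illustration. Pending stores are applied
in FIFO order, and the owner field is rewritten only once no pending
store still targets the data unit.

```python
from collections import deque


def change_owner_when_drained(directory, cache, pending, unit, new_owner):
    """Apply outstanding stores to `unit`, then change its owner.

    `pending` is a deque of (unit, offset, value) stores in FIFO order
    (a hypothetical stand-in for the store stack). Stores are drained in
    order until none targets `unit`; only then is the owner rewritten.
    """
    while any(u == unit for u, _, _ in pending):
        u, offset, value = pending.popleft()
        cache[u][offset] = value      # the store reaches the cache
    directory[unit] = new_owner       # interlock released


directory = {0x40: 1}
cache = {0x40: [0, 0], 0x80: [0, 0]}
pending = deque([(0x40, 0, 7), (0x80, 1, 9), (0x40, 1, 8), (0x80, 0, 5)])
change_owner_when_drained(directory, cache, pending, 0x40, 2)
```

In this sketch the store to unit 0x80 that follows the last store to
unit 0x40 stays queued, which mirrors the point that the pipeline need
not be emptied completely before ownership changes.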

This problem may occur with any store-in cache operating with
pipelined processing between a CPU and a cache that causes a delay to
stores being made in the cache after the CPU presumes it has ended its
exclusive control over a cache location. Thus, the problem can occur
with a CPU private cache (L1) when its stores are delayed by a
pipeline operation, such as by having a pipelined input store queue.
And this problem can occur with a store-in cache shared by a plurality
of CPUs and is particularly pronounced in a shared L2 store-in cache
operating with plural store-through L1 caches.

For example, in an L2 shared cache, a CPU may be storing in a location
in the L2 cache assigned to a first main memory address, when the
cache location is reassigned to a different main memory location by
the L2 replacement LRU controls. If the data unit had been changed in
the reassigned cache location, that data unit needs to be cast out to
main memory (L3) to update its associated main memory location before
it can be overlaid by newly requested data from a different main
memory address. But that data unit cannot be cast out until it has
completed storing all outstanding store commands issued to it before
its CPU provided the XI response, which stores are still in the
pipelined stack.

This invention aids system efficiency by per- 
mitting the store stack to receive input requests in 
a continuous manner. Without this invention, a CPU 
would need to stop sending store commands to its 
store stack when it provides an XI response until all 
outstanding stores then in the stack are made in 
the L2 cache in order to assure the integrity of 
system data. Such stoppage of the store stacks 
with each XI signal would reduce the rate at which 
stores would be received by the L2 cache, with a 
resulting significant loss in system efficiency.

Fig. 1 represents a data processing system containing the invention.

Fig. 2 represents the form of an L2 directory entry in the L2 cache
shown in Fig. 1.

Fig. 3 represents the form of an L1 directory entry in each L1 cache
shown in Fig. 1.

Fig. 4 represents CPU hardware in the system of Fig. 1 used in a
preferred embodiment of the invention.

Fig. 5 represents SCE (storage control element) hardware in the system
of Fig. 1 used in a preferred embodiment of the invention.

Fig. 6, Fig. 7 and Fig. 8 provide flow diagrams of a process that
operates on the hardware shown in Figs. 1 through 5 for performing the
preferred embodiment of the invention.

Fig. 1 represents a multiprocessor system (MP) containing central
processing units (CPUs) 1 - N in which each CPU contains at least one
private cache and preferably has two private caches, an instruction
cache and a data cache. Only the data cache can receive stores, and
hence is the cache of concern to the subject invention. The
instruction cache is readonly.

The CPU accesses its instructions from its instruction cache and
accesses its operand data from its data cache. Both the data cache and
instruction cache are used for fetching a data unit requested by their
CPU. If a CPU fetch request does not find a requested data unit's
address representation in a CPU's L1 cache directory, the L1 cache has
a "miss", and the requested address is sent to a shared system cache
(L2) to fetch the requested data unit.

Since the subject invention is concerned with store type accesses, the
readonly instruction cache is ignored in the following discussion.
Each L1 data cache is a store-through type of cache, and hereafter it
is referred to as each CPU's L1 cache. If an instruction is to be
stored into, it is done only in the instruction's data unit in the L2
cache, and then that data unit is fetched into the requesting
instruction cache as a readonly data unit.

L2 requests comprise all L1 fetch misses and all I/O requests. If an
L2 request is not found in the L2 cache, then the L2 cache has a
"miss", and the requested address is sent to system main storage (L3),
from which the requested data unit is fetched and is sent on the
memory bus to the L2 cache, and the L1 data unit is sent to the
requesting L1 cache generating the request. The data unit for the L1
cache need not be the same size as the data unit in the L2 cache which
contains the L1 data unit. Thus each L1 data unit may be a
sub-multiple of an L2 data unit, or they may be the same size.

All CPU stores are made in L2 (as well as in L1). But stores are not
requests to L2 but are handled as store commands to the caches. The
reason is that all store commands are preceded by an L2 fetch request
for obtaining the required data unit in both the L1 and L2 caches.
Once the data unit exists in the caches, commands to store accomplish
the store operation.

The L2 directory contains an input priority circuit that receives all
requests to the L2 cache, i.e. for all CPUs and all I/O devices. The
priority circuit selects one request at a time for accessing in the L2
cache directory. A high-order field in the selected request selects a
row (congruence class) in the L2 directory (not shown) and a
comparison with an address portion finds any assigned cache directory
entry and associated cache data unit location, as is conventionally
done in set associative caches, so these cache contained items are not
shown herein. Each L1 and L2 cache herein is presumed to be a 4-way
set associative cache.

Each L2 directory entry contains the fields shown in Fig. 2, and each
L1 directory entry contains the fields shown in Fig. 3. Each L2 entry
contains a CPU identifier (CPID) field (e.g. three bits) which is
combinatorially set to a value (e.g. 1 to 6) that can identify one CPU
in the MP which is the current exclusive owner of the corresponding
data unit in the L2 cache. A zero value in the CPID field indicates a
public ownership for the corresponding L2 data unit.
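As an illustration of this encoding, the following sketch decodes a
three-bit CPID value. The function name and the returned tuple are
assumptions made for the example; the text does not state what the
value 7 means, so the sketch simply rejects it.

```python
from typing import Optional, Tuple


def decode_cpid(cpid: int) -> Tuple[str, Optional[int]]:
    """Interpret a three-bit CPID field from an L2 directory entry."""
    if not 0 <= cpid <= 7:
        raise ValueError("CPID is a three-bit field")
    if cpid == 0:
        return ("public", None)       # readonly, shared by all CPUs
    if cpid <= 6:
        return ("exclusive", cpid)    # owned by the identified CPU
    # Assumption: the value 7 is left unassigned in this sketch.
    raise ValueError("CPID value 7 is not assigned in this sketch")
```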

If a requested address is not found in the addressed row in the L2
directory, a conventional LRU replacement circuit (not shown)
allocates a replacement entry for each congruence class, in which it
designates one of the four entries as the candidate for the next
allocation in the congruence class to a requested data unit that must
be fetched from L3 memory. Generally, the candidate entry is a
currently invalid entry, but if there are no invalid entries, it
selects the LRU entry of the four entries.
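The candidate selection just described can be sketched as follows. The
dictionary fields `valid` and `last_used` are hypothetical; real LRU
circuits typically keep compact ordering bits per congruence class
rather than timestamps.

```python
def pick_replacement(entries):
    """Pick the entry index to reuse within one congruence class.

    Each entry is a dict with `valid` (bool) and `last_used` (int, where
    a larger value means more recently used). Both fields are
    illustrative assumptions.
    """
    for index, entry in enumerate(entries):
        if not entry["valid"]:
            return index              # prefer a currently invalid entry
    # No invalid entry: fall back to the least recently used one.
    return min(range(len(entries)), key=lambda i: entries[i]["last_used"])
```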

Before a requested data unit can be obtained from L3 and stored into
the cache slot associated with a newly allocated L2 entry (the
associated slot in a cache data array), any old data unit existing in
that slot (represented by the current content of the L2 directory
entry) must be checked in the directory entry to determine if it has
changed data. This is done by checking the state of a change field
(i.e. change bit) in the contents of the L2 entry before the entry is
changed to represent the newly requested data unit. If the old data
unit has been changed (as indicated by its CHG bit), it is the latest
version of the old data unit which must be cast out to the same
address in main memory before the newly requested data unit can be
stored in the associated location in the cache.
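A minimal sketch of this check, assuming a dictionary-backed main
memory and an entry with `valid`, `chg` and `address` fields (all
hypothetical names). It deliberately omits the ownership interlock, so
it shows only the order of castout and fetch.

```python
def reallocate(entry: dict, slot: bytearray, memory: dict, new_address: int) -> bool:
    """Reuse an L2 entry for `new_address`; return True if a castout occurred."""
    castout = bool(entry["valid"] and entry["chg"])
    if castout:
        # The slot holds the latest version: write it back first.
        memory[entry["address"]] = bytes(slot)
    slot[:] = memory[new_address]     # fetch the newly requested unit
    entry.update(valid=True, chg=False, address=new_address)
    return castout
```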

Thus, Fig. 1 generally illustrates a multiprocessor (MP) computer
system which may contain the subject invention. It includes N number
of CPUs each having a private store-through cache (L1) with its L1
cache directory. Each CPU accesses storage fetch requests in its L1
cache as long as it obtains cache hits indicating the requested data
is available in its L1 cache.

However, sometimes requested data is not available in its L1 cache,
and the cache then signals an L1 cache miss to the L2 cache. The fetch
request is sent to the next level in the system storage hierarchy,
which is the L2 cache in Fig. 1, to fetch the requested data unit, and
is put into a request register, REQ 1 - REQ N, associated with the
requesting CPU. The CPU request also indicates the type of ownership
which is being requested of the data unit to be fetched, which may be
either exclusive or readonly.

After a data unit has been fetched into a CPU's L1 cache from the L2
cache, the CPU may make store commands for storing data into the data
unit. A store command usually does not overwrite the entire data unit
in either the L1 or L2 cache, but writes only changed byte(s) into the
data unit (which may, for example, contain dozens of bytes). This
manner of writing into a data unit is well known in the art, using
mark bits in the store command to represent the parts of a data unit
to be changed by a given store command.
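A partial store controlled by mark bits might look like the sketch
below. One mark per byte is an assumption made for simplicity; the
text does not state the granularity of the mark bits.

```python
from typing import Sequence


def apply_store(unit: bytearray, data: bytes, marks: Sequence[bool]) -> None:
    """Write only the marked bytes of `data` into `unit`, in place."""
    if not (len(unit) == len(data) == len(marks)):
        raise ValueError("unit, data and marks must have equal length")
    for i, marked in enumerate(marks):
        if marked:
            unit[i] = data[i]         # unmarked bytes keep their old value
```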

Also, an I/O request register, REQ K, receives all input and output
(I/O) device requests to memory. An I/O request accesses the L2 cache
since the latest version of a data unit may reside in the L2 cache,
where it may be changed by the I/O request. If the I/O request is not
in L2, it is then accessed in the L3 main memory without accessing the
data unit into the L2 cache.

REQ 1 - REQ K present their contained requests to the input priority
circuit of the L2 shared cache. The presented requests are sequenced
by the priority circuit, which presents one request at a time to the
L2 cache directory for accessing on a machine cycle or subcycle basis.

Figs. 4 and 5 show the hardware pipeline for an embodiment of the
invention contained in each of the CPUs and the SCE shown in Fig. 1.
The store pipeline in Figs. 4 and 5 connects the stores from any CPU
to the shared L2 cache. The nomenclature CPx is used in Figs. 4 and 5
to designate any of the N number of CPUs that is currently receiving
an XI signal from the SCE.

Each CPU store command causes storing in both the respective CPU's L1
cache and in the shared L2 cache. The manner of storing in L1 may be
conventional. Fig. 4 shows a store queue 26 which receives the store
commands from its CPx in FIFO order, and sends them to a store stack
27 (located in the SCE, which is the L2 cache and L3 main memory
controller) which is in Fig. 5. The stack outputs its oldest store
command to the L2 priority circuit for accessing in the L2 directory
and L2 cache. Each store command in the store queue 26 and store stack
27 contains both the address and the data for a single store
operation.

The FIFO order of handling store commands in stack 27 is maintained by
inpointer and outpointer registers, INPTR & OUTPTR. INPTR locates the
current entry in the stack for receiving the next store from queue 26.
OUTPTR locates the oldest store in stack 27 to be outputted to the L2
cache. INPTR is incremented each time a store is received in the
current inpointer location, and OUTPTR is incremented each time a
store is outputted from the stack. Both the INPTR and OUTPTR wrap in
the stack so that the stack never runs out of space for a next entry.
This type of stack pointer control is conventional.
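The wrapping pointer control can be sketched as a ring buffer. The
class and method names are illustrative, and the explicit occupancy
count with its overflow check is an assumption added so that the
sketch is safe to run; the text does not describe how a full stack is
handled.

```python
class StoreStack:
    """FIFO store stack addressed by wrapping in/out pointers."""

    def __init__(self, size: int) -> None:
        self.entries = [None] * size
        self.inptr = 0    # INPTR: entry that receives the next store
        self.outptr = 0   # OUTPTR: oldest store still in the stack
        self.count = 0    # outstanding stores (assumption of this sketch)

    def push(self, address: int, data: bytes) -> None:
        if self.count == len(self.entries):
            raise OverflowError("store stack full")
        self.entries[self.inptr] = (address, data)
        self.inptr = (self.inptr + 1) % len(self.entries)    # wrap
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("store stack empty")
        entry = self.entries[self.outptr]
        self.entries[self.outptr] = None
        self.outptr = (self.outptr + 1) % len(self.entries)  # wrap
        self.count -= 1
        return entry
```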

The CPz, CORn or IOy request command registers 1z, 1n or 1y
respectively receive the L1 CPU fetch requests, L2 cache LRU
replacement requests and I/O device requests for accesses in the L2
cache. Each request command (i.e. requestor) puts into a request
register the main memory address (or a representation thereof) of the
requested data unit and the requested type of ownership (EX or RO).
The registers 1z, 1n and 1y represent different types of request
registers, of which only one register is doing a request into the L2
cache at any one time in the embodiment. One of these registers is
selected at a time by the L2 priority circuit for a current access
cycle for accessing an entry in the L2 directory and its associated
cache slot that contains the associated data unit.

EP 0 507 066 A1

Thus CPz request register 1z represents any L2 request register that
receives any CPU request to L2. The subscript z indicates the CPU is a
requesting CPU, while the subscript x is used herein to indicate any
CPU which is receiving an XI signal.

The CORn (castout) register 1n represents any of plural castout
request registers that receives a current castout request for L2. The
subscript n indicates the assigned register of the plural castout
registers assigned by an LRU replacement circuit for L2 (not shown) to
receive the castout address. Replacement of the content of an L2 entry
may be done in the conventional manner when a CPU request does not hit
(i.e. misses) in the L2 directory.

The IOy register 1y represents any of plural registers that is
selected by the L2 priority as its current request to the L2
directory. Only I/O requests that hit in L2 are used by this
embodiment; an I/O request that does not hit (i.e. misses in the L2
directory) is not fetched into L2, but is then accessed in the L3 main
memory in the conventional manner.

Whichever of the registers 1z, 1n or 1y is currently selected has its
address provided to comparators 28. And all addresses in stack 27 are
provided in parallel to comparison circuits 28 which simultaneously
compare all contained stack command addresses with the currently
selected request address CPz, CORn or IOy being provided to the L2
cache.
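In software terms, the parallel comparison yields one match bit per
stack entry. The sketch below assumes that the stack addresses and the
request address have already been reduced to the same data-unit
granularity, which the text does not spell out; the function name is
hypothetical.

```python
from typing import List, Sequence


def compare_stack(stack_addresses: Sequence[int], request_address: int) -> List[bool]:
    """Return one match bit per stack entry, as parallel comparators would."""
    return [address == request_address for address in stack_addresses]
```

Any set bit in the result means that at least one outstanding store
still targets the requested data unit.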

An access 2 in the SCE tests the value of the CPID field in the
currently accessed L2 directory entry in the detailed embodiment. If
circuit 2 detects the tested CPID value is in the range of 1-6, it
indicates an EX ownership by the identified CPU. But if the tested
CPID is zero, access 2 has detected a public RO ownership for the data
unit represented by the currently selected L2 entry.

If exclusive ownership is detected by access 2, it invokes the
generation of a specific cross-invalidate (XI) signal which is sent
only to the one CPx identified by the tested CPID. A detected CPID
value of from 1 to 6 in this embodiment indicates the one CPU in the
system having exclusive ownership of the data unit associated with the
currently selected L2 directory entry. A detected value of zero for
the CPID indicates that data unit has public ownership and is
therefore readonly. If public ownership is detected by access 2, it
invokes the generation of a general XI signal which is sent to all
CPUs except the requesting CPU.
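The choice between the two XI types can be sketched as below. The
function name and return format are assumptions. The sketch also folds
in the earlier statement that no XI signalling is done when a publicly
owned unit is requested for public ownership, and it assumes the
requester is not already the exclusive owner, a case this passage does
not cover.

```python
from typing import List, Tuple


def select_xi(cpid: int, requester: int, wants_exclusive: bool,
              all_cpus: List[int]) -> Tuple[str, List[int]]:
    """Return the XI type and the CPUs that receive it.

    Assumes `requester` differs from the exclusive owner named by `cpid`.
    """
    if cpid == 0:
        if not wants_exclusive:
            return ("none", [])       # public to public: no XI needed
        # Public ownership: general XI to every CPU except the requester.
        return ("general", [c for c in all_cpus if c != requester])
    # Exclusive ownership: specific XI to the one owning CPU.
    return ("specific", [cpid])
```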

The specific XI signal initiated by access 2 is sent only to the CPU
identified by the CPID in the L2 directory entry. The specific XI
signal includes the main memory address (or a representation thereof)
for the affected data unit in the receiving processor's cache, an XI
type indicator (specific or general), and an identifier (ID TAG) for
this L2 request command (requestor) so that the SCE can determine
which requestor is responsible for a received XI response. The
specific XI type indicator also indicates whether the addressed data
unit is to be invalidated or changed to public ownership. In the SCE,
the sending of a specific XI signal sets an "XI response wait mode"
latch 8 to "XI wait mode". The XI wait, caused by a specific XI
signal, is ended when the SCE receives the XI response for the XI
requestor that sent the XI signal.

The general XI signal initiated by access 2 is
sent to all CPUs except the requesting CPU, and is
put into all of the respective XI queues. The
receiving CPUs will invalidate the XI addressed data unit,
if it exists in their L1 cache, and do not provide
any XI response.

As soon as any XI signal is sent for any
requestor, the SCE can immediately service its next
requestor, because the XI ID tag will allow
correlation of each XI response with its requestor by the
use of the requestor's ID tag.
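
The role of the ID tag can be illustrated with a small Python sketch. It assumes a simple table of outstanding tags; the class `XiTracker` and its method names are hypothetical and are not taken from the embodiment.

```python
from typing import Dict


class XiTracker:
    """Correlates XI responses with requestors by means of an ID tag."""

    def __init__(self) -> None:
        self._pending: Dict[int, str] = {}  # ID tag -> waiting requestor
        self._next_tag = 0

    def send_specific_xi(self, requestor: str) -> int:
        """Record a specific XI sent for a requestor and return its ID tag."""
        tag = self._next_tag
        self._next_tag += 1
        self._pending[tag] = requestor
        return tag

    def receive_response(self, tag: int) -> str:
        """Return the requestor that was waiting for this XI response."""
        return self._pending.pop(tag)
```

Because each response carries its tag, responses may arrive in any order while the SCE goes on servicing other requestors.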

A specific XI signal received by any CPx
requires that CPU to stop sending stores to that XI
addressed data unit, and give up exclusive
ownership. However, the CPU can finish up any required
operations to that data unit before giving up
ownership. When the CPU reaches a point where it can
give up ownership (this does not necessarily mean
all store commands in store queue 26 to the XI
addressed data unit are done), it outgates the XI
signal from the XI queue 21. The XI queue 21
gates the invalidation addresses with the XI signal
to a compare circuit 22 that compares the XI
invalidation address in parallel with all addresses
currently in the CPx store queue 26 and generates a
compare or no compare signal. The XI invalidation
address is also used to invalidate any entry in the
CPx L1 cache equal to the XI invalidation address.

If circuit 22 provides a compare equal signal, it
activates an "update queue" circuit 23 which stops
store queue 26 from sending any store commands
to the XI addressed data unit (stores to other data
units may continue) and updates store queue 26 to
mark those store command(s) to the XI addressed
data unit. The "update queue" circuit 23 also
activates an "XI response" circuit 24 to send an XI
response signal to the SCE where it resets the "XI
response wait mode" latch 8 to terminate the XI
wait mode in the SCE.

If there are any marked store commands in
store queue 26, they will start a process that will
re-acquire exclusive ownership to that data unit (by
sending a fetch exclusive command to the SCE).
When exclusive ownership is re-obtained for the
data unit the marked store commands are
unmarked and they become eligible to be sent to the
store stack 27.

If circuit 22 provides a no compare signal on
its output G, it indicates there are no store
commands in store queue 26 for the XI addressed data
unit, and output signal G activates the "XI
response" circuit 24 to send an XI response signal to
the SCE where it resets the "XI response wait
mode" latch 8 to terminate the XI wait mode in the
SCE.
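
The handling of a specific XI signal inside CPx, as described in the preceding paragraphs, might be modelled as follows. This is a simplified software sketch of the compare, mark, invalidate and respond sequence; the names `StoreCommand`, `Cpu` and `handle_specific_xi` are hypothetical, and addresses are assumed to be data unit addresses.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class StoreCommand:
    address: int          # address of the data unit the store targets
    marked: bool = False  # True while exclusive ownership must be re-acquired


@dataclass
class Cpu:
    store_queue: List[StoreCommand] = field(default_factory=list)
    l1_valid: Set[int] = field(default_factory=set)  # data units valid in L1

    def handle_specific_xi(self, xi_address: int) -> int:
        """Handle one XI signal and return the number of stores marked.

        In either case (compare or no compare) the description has the
        XI response sent to the SCE, so the caller responds regardless
        of the returned count.
        """
        marked = 0
        for cmd in self.store_queue:  # models the parallel compare of circuit 22
            if cmd.address == xi_address:
                cmd.marked = True     # models the "update queue" circuit 23
                marked += 1
        self.l1_valid.discard(xi_address)  # invalidate the L1 copy, if any
        return marked

    def sendable_stores(self) -> List[StoreCommand]:
        """Stores still eligible to go to the store stack (unmarked ones)."""
        return [cmd for cmd in self.store_queue if not cmd.marked]
```

Stores to other data units stay eligible, which matches the statement that stores to other data units may continue.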

The reset of wait mode circuit 8 causes it to
output a wait mode termination signal which gates
comparator 28 to compare the current L2 request
address with all addresses currently in the CPx
store stack 27 using a single cycle parallel
compare operation. A compare-equal (cmpr) signal
from circuit 28 to an AND gate 29 inputs the
content of an INPTR register into a capture INPTR
register 10 that captures the current location in
stack 27 available for an input of a current CPU
store command. Once the INPTR value is
captured, CPx can continue to send store commands
to the store stack, which will change the INPTR
value but not the captured INPTR 10. The captured
INPTR value indicates the last location in the CPx
store stack 27 which may contain the last store
command from CPx for the requested data unit,
and the OUTPTR value indicates the CPx store
stack location having the oldest store from CPx.
The OUTPTR value is being continuously
incremented to continuously output its store command
entries to update the L2 cache entries. The
incrementing of OUTPTR will cause its contained
pointer address to wrap around and finally become
equal to the captured INPTR value.

The captured INPTR value is provided to a
pointer comparison circuit 38 which compares it
with the stack OUTPTR value as the OUTPTR is
incremented to output the store commands to the
L2 cache. As long as the OUTPTR does not
compare equal with the INPTR, an output signal D is
provided from pointer compare circuit 15 to set the
"store done mode" latch 13 to indicate that the
store stack outputting is not yet done. When the
OUTPTR finally compares equal with the INPTR,
an output signal E is provided from circuit 15 to
reset the "store done mode" latch 13 to indicate
that all possible store commands have been
outputted from stack 27 into the cache.
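
The pointer interlock can be illustrated by modelling the store stack as a circular buffer. The sketch below is a minimal software analogy under stated assumptions: the class `StoreStack` and its method names are hypothetical, the stack is assumed never to overflow, and the cache is represented by a plain list that records stores in the order they are output.

```python
from typing import List, Optional


class StoreStack:
    """Circular store stack with an inpointer, outpointer and captured inpointer."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.slots: List[Optional[int]] = [None] * size
        self.inptr = 0    # next location available for an incoming store
        self.outptr = 0   # location of the oldest store not yet output
        self.captured: Optional[int] = None  # captured INPTR value

    def push(self, address: int) -> None:
        """CPx puts a store command into the stack (overflow not modelled)."""
        self.slots[self.inptr] = address
        self.inptr = (self.inptr + 1) % self.size

    def capture(self) -> None:
        """Capture the current INPTR; later pushes do not change it."""
        self.captured = self.inptr

    def drain_one(self, cache: List[int]) -> None:
        """Output the oldest store to the cache and advance OUTPTR with wrap."""
        entry = self.slots[self.outptr]
        assert entry is not None
        cache.append(entry)
        self.outptr = (self.outptr + 1) % self.size

    def stores_done(self) -> bool:
        """True when OUTPTR has reached the captured INPTR (signal E)."""
        return self.captured is not None and self.outptr == self.captured
```

Stores pushed after the capture remain in the stack when `stores_done()` first returns true, which is the point of the scheme: the processor need not stop storing, and the stack need not be emptied, before ownership changes.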

When access 2 has found that public
ownership exists for the currently accessed L2 entry, a
current CPz request is examined by circuit 4 to
determine if it wants exclusive or public ownership.
If CPz wants exclusive ownership then general XI
signalling is required to all other CPUs that contain
that data unit. (However, if the CPUs containing the
data unit are specifically known, the general XI
signalling need only be sent to them without being
sent to the CPUs known not to contain the data
unit.) If CPz wants public ownership then no XI
signalling is required.

All IOy requests are handled by access 2
merely sending a general XI invalidate signal,
which prevents any CPU from interfering with any
I/O access in the L2 cache.

Thus, the general XI signal from access 2 is
used when there is no need for any XI
response from any of the plural CPUs which may
contain the data unit, since none can be doing
store commands and all that is needed is L1
invalidation.

If a "no compare" output should be provided
by stack compare circuits 12 (indicating no store
commands from CPx exist in the stack) or the
public ownership RO indication 6 exists from circuit
2 in the currently accessed L2 directory entry, the
access operations represented by boxes 7, 16 and
20 are used. In all of these cases, there are no
outstanding stores because the public ownership of
the current data unit prevents stores from
happening.

The change field in an accessed public entry is
detected only for a CORn request because it is
needed for castout control. For CPz and IOy
requests, no castout is done but instead the
accessed entry is transferred to CPz and IOy requests
regardless of the state of the change field, which
therefore is not detected. Hence, change access
circuit 7 detects the change field in the current
directory entry only for a CORn request; change
access circuit 7 is not used for a CPz or IOy
request.

But if, for a CORn request, change access
circuit 7 finds the change field indicates no change,
then there is no need for a castout (since the data
unit is the same in main memory L3), and the
directory entry update means 20 can immediately
update the directory entry by overlaying its content
with information from the CPz request that caused
the respective CORn request.

Thus, for a CORn request, if change access
circuit 7 detects the change bit is set on in the
current directory entry, data unit access 16 is
needed to access the updated associated data unit
from the cache data arrays for the request, i.e. a
switch 17 sends the data unit (castout) to the next
storage level L3. For a CPz or IOy request, access
16 can immediately obtain the associated data unit
from the cache data arrays for the request, i.e. a
switch 18 sends the data unit to CPz for a CPU
request, and switch 19 sends the data unit to the
requesting channel IOy.
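
The routing just described can be summarised in a short Python sketch. The function name `route_data_unit` and the string labels are hypothetical; the sketch only restates which destination receives the data unit for each type of request.

```python
def route_data_unit(request: str, changed: bool) -> str:
    """Return where the accessed data unit is sent for a given request.

    The change field matters only for a CORn (castout) request; for CPz
    and IOy requests it is not examined.
    """
    if request == "CORn":
        # Castout to the next storage level only if the data unit changed.
        return "L3" if changed else "no castout"
    if request == "CPz":
        return "CPz"  # models switch 18
    if request == "IOy":
        return "IOy"  # models switch 19
    raise ValueError(f"unknown request type: {request}")
```

A CORn request against an unchanged data unit therefore moves no data at all, since main memory already holds an identical copy.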

Directory entry update means 20 is
immediately used for a CORn request that finds no change
in the associated data unit. But if the directory
entry update means 20 is being used for a CPz
request, then the update of directory entry content
by means 20 is delayed until after the castout has
been completed (for system recovery reasons the
initial content of the entry may be needed if a
system failure should occur before the castout is
completed).

The timing delay for the cache data access 16
is controlled by the output F from the "store done"
latch 13 when it is reset by a compare-equal signal
E from PTR compare circuit 15 (when the INPTR
and OUTPTR are equal). All CPz store command
entries to the requested data unit in stack 27 will
have been flushed out to the cache when circuit 15
signals its output signal E, since then the OUTPTR
will have revolved back to the captured INPTR
starting point for the stack output operation, and
then cache data access 16 may be initiated.

Process Operations in Figs. 6, 7 and 8

The reference numbers in Figs. 6, 7 and 8 are
functionally related to the reference numbers used
in Figs. 4 and 5, wherein 100 has been added to
the latter reference numbers to generate the former
reference numbers. The following steps in the
novel process disclosed herein also are related to the
reference numbers in Figs. 4 and 5.

(101) A requestor (address in castout
register 1n, CPU fetch request register 1z
or I/O fetch request register IOy) is
valid and selected by the L2 priority for
the current L2 cycle.

(102) The directory entry accessed by the
requestor, which contains the ownership
information, is examined to determine if
the data is held exclusive or read-only.

(103) The data is found to be held
exclusively by a CPx, and therefore CPx
must be sent a specific XI signal that
its exclusive ownership is being
rescinded. This cross-interrogation signalling
also includes the identity (ID) of the
requestor, which is returned to the
SCE when CPx issues its completion
response (its XI response).

(104) The directory entry for the currently
accessed L2 data unit indicates it is
publicly owned, and if CPz did not
request exclusive ownership, no XI
signalling is needed. See Fig. 6.

(105) When a CPz requests exclusive
ownership of a data unit, and it is found to
be publicly owned among the CPUs,
the SCE sends a general XI signal to
all CPUs (excluding CPz) to invalidate
all L1 copies of the data. Since the
requested data unit is publicly owned,
there is no need for the requesting
CPz to wait for an XI response before
accessing the data unit, since no
outstanding stores exist in the stack. The
CPz requestor then gets exclusive
ownership of its fetched data unit.

(106) For CORn castout requests of a public
data unit, its directory change flag field
is checked to determine if a castout is
really needed to preserve the latest
version of the data unit in main
memory, since the data unit in the
associated cache slot will be destroyed; if
it has not been changed, an exact copy
already exists in main memory.
For CPz and IOy exclusive fetch
request hits to a public data unit, the
change field does not need to be
checked since the associated data unit
(whether changed or not) only has its
ownership changed by the current
request and is not otherwise affected.

(107) Here, a CORn request has its change 
field checked for a public data unit. 
Only if the associated data unit is 
changed, is a castout required. 

(108) At this time, a check is made whether
the L2 cache is waiting for an XI
response from the specified CPx because
that CPx has received a specific XI
signal. The XI response wait
mode latch is set by a specific XI
signal, but it is not set by a general XI
signal because it does not generate an
XI response.

(109) CPx issues an XI response to its
received XI signal. In responding, CPx
indicates it will not provide any more
stores to the currently accessed L2
data unit, and CPx has given up its
exclusive ownership to that data unit. A
requestor ID TAG is returned with its
XI response, which is then decoded in
the SCE and sent to the proper
requestor (CPz, IOy or CORn request
register).

(110) Upon reception of the CPx response,
the value of the store stack inpointer
(INPTR) identifying the most recent
store command entry from CPx is
captured (which is the last possible store
command from CPx for the currently
accessed L2 data unit) for possible use
by the requestor.

(111) A parallel address compare is done 
between the requestor address and the 
addresses in all store commands in the 
CPx store stack to determine if no CPx 
store command exists to the reques- 
tor's address. 

(112) Results of the parallel address
compare indicate if no compare was found,
which indicates that no stores exist that
could interfere with the requested
change of ownership. If any compare
is found, the requestor must wait for all
stores in the CPx stack to be
completed up to the captured INPTR
before the requested change of
ownership can be made.

(113) If any store compare is indicated by
step 112, the requestor must wait for
the last store command to the
addressed data unit in the CPx stack to
be completed in the L2 cache. See
Fig. 7.

(114) To determine when the last store to 
the addressed data unit has been com- 
pleted from the CPx stack, a pointer 
address compare is done between the 
captured INPTR and the incremented 
outpointer (OUTPTR), which deter- 
mines when all CPx store commands 
in the CPx store stack have been 
made into their addressed L2 cache 
data units. 

(115) The completion of the last possibly
pertinent store in the stack is indicated
when the captured INPTR equals the
incremented OUTPTR. Then it is safe
to change ownership for the requested
data unit, and any caused castout data
unit. Before the last store position is
reached by the OUTPTR, the pointer
compare outputs a signal D to set the
store done latch 13 to its
store-not-done state.

(116) The requested ownership change is
initiated by output E from the pointer
compare resetting the store-done
mode latch, which then provides the
store-done signal F that provides
cache data access 16 for the
requested data unit.

(117) For a CORn type of requestor, the
cache data access 16 provides the
data unit for a castout to the next level
storage, since all exclusive conflict has
been resolved for the data unit. And
CORn is released for another castout
request.

(118) For a CPz request, the data unit of
cache data access 16 is sent to the
CPz requestor, since any exclusive
conflict has been resolved for the data
unit. And the CPz request register is
released for another CPU request.

(119) For an IOy request, the data unit of
cache data access 16 is sent to the
IOy requestor, since any exclusive
conflict has been resolved. And the
IOy request register is released for
another I/O request to memory.

(120) The accessed directory entry is
updated. For a CORn requestor, the entry
is made available for requested new
data. For an IOy fetch request, public
ownership is set in the directory entry.
For a CPz request, the requested
exclusive or public ownership is set into
the directory entry.

(121) CPx checks its XI queue 21 and finds
an XI signal needing to be handled,
and outgates the XI signal from XI
queue 21. See Fig. 8.

(122) CPx compares the address in the XI
signal with the addresses of the store
commands in its store queue 26. If no
store command addresses in queue 26
compare equal to the XI address then
no store commands in queue 26 are
marked.

(123) If any store commands compare equal
with the XI signal, they are marked to
indicate that data unit is no longer
owned exclusive. CPx must re-acquire
exclusive ownership of this data unit
before it can send the marked store
commands to the store stack 27.

(124) The CPx L1 cache directory is
searched for the address with the XI
signal, and any entry with that address
is marked invalid, and an XI response
signal is sent to the SCE.

In CPx after the ownership is changed for the
accessed data unit:

(A) If any store is marked in the CPx store
queue 26, its data unit is identified in the
marked entry and it is refetched and
re-executed.

(B) Continue normal processing in CPx.

Thus, while the invention has been described
with reference to preferred embodiments thereof, it
will be understood by those skilled in the art that
various changes in form and details may be made
therein without departing from the spirit and scope
of the invention.

Claims

1. Apparatus of ownership change control for a
data unit in a cache of a data processing
system, comprising:

pipeline means for receiving and temporarily
storing a plurality of store commands
generated by processor means in the system for
storing data in various data units in the cache;

ownership relinquishing means for signalling a
first processor means to relinquish ownership
of a requested data unit when a second
processor means requests ownership of the data
unit currently owned by the first processor
means for storing in the data unit;

means for detecting if any store command by 
the first processor means for the requested 
data unit exists in the pipeline means; and 

ownership change means for immediately 
changing the ownership of the data unit in the 
cache if the detecting means detects no store 
command to the requested data unit in the 
pipeline means. 

2. Apparatus as defined in claim 1, characterized
by

means for capturing an inpointer locating a
store command last provided in the pipeline
means by the first processor means when the
first processor means signals a response to
the signal from the ownership relinquishing
means;

means for comparing the inpointer with an
outpointer that locates a store command in the
pipeline means currently being outputted to
the cache; and

the ownership change means signalling when 
the inpointer equals the outpointer to indicate 
when the ownership of the requested data unit 
is to be changed if at least one store command 
to the requested data unit is in the pipeline 
means. 

3. Apparatus as defined in claim 2, characterized
by

directory means for the cache in which each
entry is capable of representing an associated
data unit in the cache, and each entry
indicating an ownership for the associated data unit
as exclusive to a processor means or as public
to all processor means.

4. Apparatus as defined in claim 3, characterized
by

a processor means identifier being provided in
each entry to indicate which of plural
processor means is a current owner of the associated
data unit if exclusive ownership is indicated.



5. Apparatus as defined in claim 4, characterized
by

the ownership relinquishing means signalling
all other processor means in the system with a
general XI signal to relinquish ownership of a
requested data unit currently indicated in a
directory entry as being publicly owned when
a processor means requests exclusive
ownership of the data unit.

6. Apparatus as defined in claim 5, characterized
by

means for recognizing if a request for
exclusive ownership is from a central processor
means (CPU) and changing the processor
means identifier to the identifier of the
requesting CPU in a directory entry accessed for a
request by the CPU.

7. Apparatus as defined in claim 5, characterized
by

means for recognizing if a store request is
from an input/output (I/O) channel and sending
a general XI signal to all CPUs in the system
to relinquish public ownership.

8. Apparatus as defined in claim 5, characterized
by

means for recognizing if a read-only fetch
request is from any CPU or input/output (I/O)
channel and allowing fetch access to the fetch
request without sending any XI signal to any
CPU if the indicated ownership is found to be
public in a directory entry accessed by the
CPU or (I/O) request.

9. Apparatus as defined in claim 4, characterized
by

the ownership relinquishing means signalling
only to a processor means indicated in a
directory entry accessed by a processor means
request as being the exclusive owner of the
associated data unit, and means for updating
the directory entry to indicate the exclusive
ownership of the requesting processor means
when the request is for exclusive ownership.

10. Apparatus as defined in claim 4, characterized
by

the ownership relinquishing means signalling
only to a processor means indicated as being
the exclusive owner of the associated data unit
in a directory entry accessed by a castout
request;

means for detecting if the directory entry
indicates its associated data unit has been
changed;

means for casting out the associated data unit;
and

means for updating the directory entry after
the castout to indicate the exclusive ownership
of the requesting processor means when the
ownership change means indicates the
ownership is allowed to be changed.



11. Apparatus as defined in claim 4, characterized
by

the ownership relinquishing means signalling
only to a processor means indicated as being
the exclusive owner of the associated data unit
in a directory entry accessed by a castout
request;

means for detecting if the directory entry
indicates its associated data unit has not been
changed; and

means for updating the directory entry without
any castout occurring to indicate the exclusive
ownership of the requesting processor means
when the ownership change means indicates
the ownership is allowed to be changed.

12. Ownership change control for a data unit in
caches of a data processing system,
characterized by

a plurality of central processors (CP) in the
system, each CP having a private cache (L1)
and an L1 directory in which each entry
marked as valid is associated with a respective
data unit in the L1 cache located by using an
address provided by the CP;

a shared directory shared by plural CPs in the
system in which any CP can request either
exclusive or public ownership of a data unit
fetched into the cache, each entry having
means for identifying a CPU owning the entry
and its associated data unit when a flag field in
the entry indicates an exclusive state although
the flag field is also capable of indicating
public ownership for the associated data unit, and
a change field in the entry for indicating when
the data unit has been changed in the cache,
and only the owning CPU being allowed to
store into the associated data unit;

pipeline stack means for storing a plurality of
outstanding store commands between a CP to
the cache;

means for detecting if any store command
exists in the pipeline stack means of a CP for
a cache data unit requested by a CP with
exclusive ownership; and

means for changing the ownership of the
requested data unit when all outstanding store
commands are made in the requested data
unit in the cache.

13. Control means for a cache shared by more
than one CPU in a multiprocessor (MP) system
in which any CPU can do exclusive-stores to
any location in the shared cache, replacement
means for assigning cache locations to receive
new CPU requests and for determining if an
assigned location is a location requiring
castout (CO) of contained data unit to a next
level in a storage hierarchy, invalidation means
associated with the cache for signalling
invalidation requests to a CPU to invalidate data unit
privately stored by the CPU in a private
location, characterized by

a plurality of CO registers for storing
addresses of current shared cache locations
determined to be CO locations;

a plurality of store stack means associated with
the respective CPUs, each stack means
receiving a sequence of store addresses and store
data units for a respective CPU; and

means for capturing the stack address of the
current store being received by the store stack
means when the CPU signals a response to an
invalidation request, the time of capture of the
current store indicating it is a last store to the
CO location.



14. Control means as defined in claim 13 for con- 
trolling the cast-outs from a cache, character- 
ized by 

means for initiating the castout from the CO 
location when the time of capture is indicated. 

15. Control means as defined in claim 13 for
coordinating an incomplete sequence of stores to
a CO location in a cache when a CPU receives
an invalidation signal, characterized by

means for detecting whether any entry exists in
the store stack means for a CO location
currently in a CO register; and

means for enabling the castout from the CO
location in the cache whenever the detecting
means indicates no stores exist in the store
stack means for the CO location.

16. Control means as defined in claim 13,
characterized by

inpointer means for addressing a next location
in the stack means to receive a store from the
CPU, outpointer means for addressing the
current location in the stack means for providing a
CPU store from the stack means to a cache
location; and

the capture means storing the content of the
inpointer means.